What’s the plan?

In this part of the tutorial you will learn how to build multivariate hhh4 models. We will proceed as follows:

Repetition: Installing packages

We require the packages surveillance and hhh4addon, the latter containing some additional functionality for hhh4 models. It can be installed directly from GitHub using the remotes package (you may already have done that for the first part of the tutorial, but the necessary code is included here to make this document self-contained).

# load packages:
# options(warn = -1)
library(surveillance)
## Loading required package: sp
## Loading required package: xtable
## This is surveillance 1.20.0. For overview type 'help(surveillance)'.
# to install hhh4addon
# install.packages("remotes")
# library(remotes)
# remotes::install_github("jbracher/hhh4addon", build_vignettes = TRUE)
library(hhh4addon)
## Loading required package: polyCub
## This is a development version of hhh4addon. It modifies and extends functionalities of surveillance:hhh4.
## 
## Attaching package: 'hhh4addon'
## The following object is masked from 'package:surveillance':
## 
##     decompose.hhh4

Example data

We will use two data sets provided with the hhh4addon package (original data from Robert Koch Institute). The noroBE data contain weekly counts of confirmed norovirus cases in the twelve districts of Berlin, 2011–2017. The rotaBE data contain the same for rotavirus. In this note we analyse the norovirus data; as an exercise you can then apply the same steps to the rotavirus data.

# get the data
data("noroBE")
# if you don't have the hhh4addon package installed you can use:
# load(
# url("https://github.com/cmmid/hhh4-workshop/raw/main/example_noro_rota/data/noroBE.Rda")
# )

We start by plotting the temporal and spatial distribution of the data:

# look at data:
# time series:
plot(noroBE, ylim = c(0, 40))

# map:
# as population(noroBE) contains population fractions rather than raw
# population sizes setting population = 100,000/<population of Berlin>
# will yield cumulative incidence per 100,000 inhabitants; see ?stsplot_space
plot(noroBE, observed ~ unit, population = 100000/3500000, labels = TRUE)
## Warning in wkt(obj): CRS object has no comment


We moreover take a look at the neighbourhood slot of the sts object. In this case it contains path distances between the districts: e.g., chwi and frkr have a distance of 2 as you need to pass through mitt or scho.

# check the neighbourhood structure, which is stored in the @neighbourhood slot 
# of the sts object:
noroBE@neighbourhood
##      chwi frkr lich mahe mitt neuk pank rein span zehl scho trko
## chwi    0    2    3    4    1    2    2    1    1    1    1    3
## frkr    2    0    1    2    1    1    1    2    3    2    1    1
## lich    3    1    0    1    2    2    1    2    3    3    2    1
## mahe    4    2    1    0    3    2    2    3    4    4    3    1
## mitt    1    1    2    3    0    2    1    1    2    2    1    2
## neuk    2    1    2    2    2    0    2    3    3    2    1    1
## pank    2    1    1    2    1    2    0    1    2    3    2    2
## rein    1    2    2    3    1    3    1    0    1    2    2    3
## span    1    3    3    4    2    3    2    1    0    1    2    4
## zehl    1    2    3    4    2    2    3    2    1    0    1    3
## scho    1    1    2    3    1    1    2    2    2    1    0    2
## trko    3    1    1    1    2    1    2    3    4    3    2    0

Developing a multivariate model step by step

Step 1: Seasonality in the endemic component, all parameters shared.

We start by formulating a very simple model where for districts \(i = 1, \dots, K\) we assume \[\begin{align} Y_{it} \ \mid \ \ \text{past} & \sim \text{NegBin}(\mu_{it}, \psi)\\ \mu_{it} & = \nu_t + \lambda Y_{i, t - 1} \end{align}\] and \[ \nu_{t} = \alpha^\nu + \beta^\nu \sin(2\pi t/52) + \gamma^\nu \cos(2\pi t/52). \] Reminder: we refer to

  • \(\nu_t\) as the endemic component
  • \(\lambda Y_{i, t - 1}\) as the autoregressive or epidemic component
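As a complement to the equations, the recursion can be mimicked in a few lines of base R. The following is a minimal simulation sketch with made-up parameter values (lambda, psi and the seasonal curve are purely illustrative, not estimates from the data); note that in rnbinom() the size argument corresponds to \(1/\psi\).

```r
# simulate one district from the recursion mu_t = nu_t + lambda * Y_{t-1}
# (illustrative parameter values only):
set.seed(123)
n <- 104                                         # two years of weekly counts
lambda <- 0.5                                    # epidemic parameter
psi <- 0.2                                       # overdispersion; size = 1/psi
nu <- exp(0.8 + 0.9 * cos(2 * pi * (1:n) / 52))  # seasonal endemic component
y <- numeric(n)
y[1] <- rnbinom(1, mu = nu[1], size = 1 / psi)
for (t in 2:n) {
  y[t] <- rnbinom(1, mu = nu[t] + lambda * y[t - 1], size = 1 / psi)
}
summary(y)
```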

In a setting with just three districts the model could be illustrated as follows – case numbers in the different districts are independent and depend only on the previous value in the same district.

In surveillance this model is specified as follows. Reminder: The hhh4 function takes the following arguments:

  • an sts object containing the data
  • a control argument describing the model formulation. This list contains a number of named arguments, the most important ones being:
    • end: a specification of the endemic component (i.e., \(\nu\)).
    • ar: a specification of the autoregressive component (\(\lambda\); dependence on the previous value from the same district).
    • ne: a specification of the neighbourhood component (\(\phi\); dependence on previous values from other districts, to be introduced later). All of end, ar and ne are lists containing a formula and possibly further elements; see ?hhh4
    • family: the type of distribution used in the recursive model formulation (either "Poisson", "NegBin1", or "NegBinM").
    • subset: The subset of the time series to which the model shall be fit.
# first define a subset of the data to which to fit (will be used in all model
# fits to ensure comparability of AIC values):
subset_fit <- 6:(nrow(noroBE@observed) - 52)
# we are leaving out the last year and the first 5 observations
# the latter serves to ensure comparability to later model versions

##################################################################
# Model 1:
# control list:
ctrl1 <- list(end = list(f = addSeason2formula(~1, S = 1)), # seasonality in end
              # S = 1 means one pair of sine / cosine waves is included
              ar = list(f = ~ 1), # no seasonality in ar
              family = "NegBin1", # negative binomial (rather than Poisson)
              subset = subset_fit)
# fit model
fit1 <- hhh4(noroBE, ctrl1)
# summary of parameter estimates:
summary(fit1)
## 
## Call: 
## hhh4(stsObj = noroBE, control = ctrl1)
## 
## Coefficients:
##                         Estimate  Std. Error
## ar.1                    -0.72854   0.03453  
## end.1                    0.84477   0.02860  
## end.sin(2 * pi * t/52)   0.09024   0.02806  
## end.cos(2 * pi * t/52)   0.88574   0.02856  
## overdisp                 0.22957   0.01076  
## 
## Log-likelihood:   -8780.82 
## AIC:              17571.64 
## BIC:              17602.7 
## 
## Number of units:        12 
## Number of time points:  307
# you can set idx2Exp = TRUE to get all estimates on the exp-transformed scale
summary(fit1, idx2Exp = TRUE)
## 
## Call: 
## hhh4(stsObj = noroBE, control = ctrl1)
## 
## Coefficients:
##                              Estimate  Std. Error
## exp(ar.1)                    0.48261   0.01667   
## exp(end.1)                   2.32745   0.06657   
## exp(end.sin(2 * pi * t/52))  1.09443   0.03071   
## exp(end.cos(2 * pi * t/52))  2.42477   0.06925   
## overdisp                     0.22957   0.01076   
## 
## Log-likelihood:   -8780.82 
## AIC:              17571.64 
## BIC:              17602.7 
## 
## Number of units:        12 
## Number of time points:  307
# get AIC to compare model fit to more complex models
AIC(fit1)
## [1] 17571.64
# visually inspect:
par(mfcol = c(6, 2))
plot(fit1, unit = 1:12, par.settings = list(mfcol = c(6, 2))) # look at all units

# par.settings = list(mfcol = c(6, 2)) tells the function we want two columns
# of plots

The grey area represents the endemic component (\(\nu_{t}\)) of the fitted values with clearly visible seasonality. The blue area represents the autoregressive component.

For model inspection I like looking at Pearson residuals \[ \frac{Y_{it} - \hat{\mu}_{it}}{\hat{\sigma}_{it}} = \frac{Y_{it} - \hat{\mu}_{it}}{\sqrt{\hat{\mu}_{it} + \hat{\mu}_{it}^2/\psi}}, \] which are not currently implemented in surveillance, but can be computed using a little auxiliary function. For now I just look at their mean and variance per district.

# helper function to compute Pearson residuals:
pearson_residuals <- function(hhh4Obj){
  # compute raw residuals:
  response_residuals <- residuals(hhh4Obj, type = "response")
  # compute standard deviations:
  if(inherits(hhh4Obj, "hhh4lag")){
    sds <- sqrt(fitted(hhh4Obj) + 
                  fitted(hhh4Obj)^2/hhh4addon:::psi2size.hhh4lag(hhh4Obj))
  }else{
    sds <- sqrt(fitted(hhh4Obj) + 
                  fitted(hhh4Obj)^2/surveillance:::psi2size.hhh4(hhh4Obj))
  }
  # compute Pearson residuals:
  pearson_residuals <- response_residuals/sds
  return(pearson_residuals)
}

pr1 <- pearson_residuals(fit1)
# compute district-wise means and standard deviations:
colMeans(pr1)
##         chwi         frkr         lich         mahe         mitt         neuk 
## -0.040128818 -0.354497174 -0.026596771 -0.149654100 -0.098957012 -0.139634691 
##         pank         rein         span         zehl         scho         trko 
##  0.363677331 -0.006394507 -0.203281407  0.422776136  0.208763434 -0.023204469
apply(pr1, 2, sd)
##      chwi      frkr      lich      mahe      mitt      neuk      pank      rein 
## 0.8969573 0.8394516 0.9984702 0.9348625 0.9594755 0.8998828 1.1386447 0.9955265 
##      span      zehl      scho      trko 
## 0.9606598 1.2538557 1.0589218 0.9730101

In each district, the Pearson residuals should have mean approximately 0 and standard deviation approximately 1, which is clearly not the case here. Because all parameters are shared across districts, the fitted values are systematically too large in some districts and too small in others.

Adding district-specific parameters.

As the districts are of quite different sizes and have different levels of incidence, it seems reasonable to use district-specific parameters and extend the model to \[\begin{align} Y_{it} \ \mid \ \ \text{past} & \sim \text{NegBin}(\mu_{it}, \psi)\\ \mu_{it} & = \nu_{it} + \lambda_i Y_{i, t - 1}\\ \nu_{it} & = \alpha^\nu_i + \beta^\nu \sin(2\pi t/52) + \gamma^\nu \cos(2\pi t/52) \end{align}\]

##################################################################
# Model 2:
# We use fe(..., unitSpecific = TRUE) to add fixed effects for each unit,
# in this case intercepts (1). Seasonality parameters are still shared
# across districts
ctrl2 <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              ar = list(f = ~ 0 + fe(1, unitSpecific = TRUE)),
              family = "NegBin1",
              subset = subset_fit)
# Note: ~0 is necessary as otherwise one would (implicitly) add two intercepts
# unit-specific dispersion parameters could be added by setting family = "NegBinM"
fit2 <- hhh4(noroBE, ctrl2)
# check parameter estimates:
summary(fit2)
## 
## Call: 
## hhh4(stsObj = noroBE, control = ctrl2)
## 
## Coefficients:
##                         Estimate   Std. Error
## ar.1.chwi               -1.795086   0.402460 
## ar.1.frkr               -1.259748   0.182296 
## ar.1.lich               -1.087080   0.177651 
## ar.1.mahe               -0.957952   0.145501 
## ar.1.mitt               -1.335034   0.214060 
## ar.1.neuk               -0.967629   0.159855 
## ar.1.pank               -0.753360   0.129017 
## ar.1.rein               -0.668191   0.103620 
## ar.1.span               -0.823339   0.125950 
## ar.1.zehl               -0.659794   0.095464 
## ar.1.scho               -0.865145   0.132413 
## ar.1.trko               -1.002428   0.165175 
## end.sin(2 * pi * t/52)   0.121489   0.024384 
## end.cos(2 * pi * t/52)   0.923900   0.024745 
## end.1.chwi               1.181472   0.088438 
## end.1.frkr               0.604251   0.081671 
## end.1.lich               1.020524   0.089385 
## end.1.mahe               0.791906   0.089509 
## end.1.mitt               1.010631   0.081603 
## end.1.neuk               0.823126   0.097049 
## end.1.pank               1.298509   0.104632 
## end.1.rein               0.791816   0.097056 
## end.1.span               0.648208   0.092789 
## end.1.zehl               1.297808   0.091415 
## end.1.scho               1.189950   0.093128 
## end.1.trko               0.997664   0.094204 
## overdisp                 0.197678   0.009884 
## 
## Log-likelihood:   -8655.46 
## AIC:              17364.92 
## BIC:              17532.63 
## 
## Number of units:        12 
## Number of time points:  307
# compute AIC
AIC(fit2)
## [1] 17364.92

We check the Pearson residuals again, which now look reasonable in terms of mean and variance:

# compute Pearson residuals and check their mean and variance:
pr2 <- pearson_residuals(fit2)
colMeans(pr2)
##         chwi         frkr         lich         mahe         mitt         neuk 
##  0.020061368  0.025144794 -0.010413219  0.003481020 -0.004801394 -0.006139166 
##         pank         rein         span         zehl         scho         trko 
## -0.012998766 -0.004079849 -0.001710578 -0.022435249 -0.007614639 -0.016314666
apply(pr2, 2, sd)

Note that we could also add district-specific dispersion parameters \(\psi_i\) by setting family = "NegBinM", but for simplicity we stick with the shared overdispersion parameter.

Adding dependence between districts.

So far we have modelled each district separately (while borrowing some strength by pooling parameters). In a next step we add dependencies via the ne (neighbourhood) component. There are different ways of specifying the coupling between districts; the neighbourhood slot of the sts object contains the necessary information on the spatial structure.

We now fit a model where only direct neighbours affect each other, setting \[\begin{align} Y_{it} \ \mid \ \ \text{past} & \sim \text{NegBin}(\mu_{it}, \psi)\\ \mu_{it} & = \nu_{it} + \lambda_i Y_{i, t - 1} + \phi_i \sum_{j \sim i} w_{ji} Y_{j, t - 1}. \end{align}\] Here, the relationship \(\sim\) indicates direct neighbourhood. The weights \(w_{ji}\) can be chosen in two ways.

  • if normalize == TRUE: \(w_{ji} = 1/\) number of neighbours of \(j\) if \(i \sim j\), 0 else
  • if normalize == FALSE: \(w_{ji} = 1\) if \(i \sim j\), 0 else

We usually use normalize = TRUE, which is based on the intuition that each district splits its infectious pressure equally among its neighbours. Note that in this formulation, autoregression on past incidence from the same district and from other districts is handled separately, with distinct parameters (\(\lambda_i\) and \(\phi_i\)).
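To make the two weighting schemes concrete, here is a minimal base-R sketch for the toy setting with three districts used below (district 2 neighbouring both 1 and 3); rows correspond to the exporting district \(j\):

```r
# toy adjacency matrix: district 2 neighbours districts 1 and 3
adj <- rbind(c(0, 1, 0),
             c(1, 0, 1),
             c(0, 1, 0))
w_unnorm <- (adj == 1) * 1               # normalize = FALSE: w_ji = 1 if i ~ j
w_norm   <- w_unnorm / rowSums(w_unnorm) # normalize = TRUE: rows sum to 1
w_norm
# district 2 splits its infectious pressure equally: w_21 = w_23 = 0.5
```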

In a simple graphical illustration with just three districts, where district 2 neighbours both 1 and 3 but 1 and 3 are not neighbours, the model looks as follows. The grey lines indicate that these dependencies are typically weaker than those on previous values from the same district.

# The default setting for the ne component is to use weights 
# neighbourhood(stsObj) == 1 (see ?hhh4).
neighbourhood(noroBE) == 1
##       chwi  frkr  lich  mahe  mitt  neuk  pank  rein  span  zehl  scho  trko
## chwi FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
## frkr FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE
## lich FALSE  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE
## mahe FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## mitt  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE FALSE
## neuk FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
## pank FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
## rein  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE
## span  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
## zehl  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
## scho  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
## trko FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
# this is because historically neighbourhood matrices were just binary
# however, the path distances are coded in a way that direct neighbours have 
# distance 1, meaning that there is no need to modify the neighbourhood matrix

##################################################################
# Model 3:
ctrl3 <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              ar = list(f = ~ 0 + fe(1, unitSpecific = TRUE)),
              ne = list(f = ~ 0 + fe(1, unitSpecific = TRUE), normalize = TRUE), 
              #  now added ne component to reflect cross-district dependencies
              family = "NegBin1",
              subset = subset_fit)
# normalize = TRUE normalizes weights by the number of neighbours of the 
# exporting district
fit3 <- hhh4(noroBE, ctrl3)

# alternative: use update
# fit3 <- update(fit2, ne = list(f = ~ 0 + fe(1, unitSpecific = TRUE),
#                                weights = neighbourhood(noroBE) == 1,  # little bug?
#                                normalize = TRUE))
summary(fit3) # parameters for different districts are quite different
## 
## Call: 
## hhh4(stsObj = noroBE, control = ctrl3)
## 
## Coefficients:
##                         Estimate   Std. Error
## ar.1.chwi               -2.984424   1.350606 
## ar.1.frkr               -1.546018   0.255272 
## ar.1.lich               -1.481704   0.269704 
## ar.1.mahe               -1.042180   0.165513 
## ar.1.mitt               -1.684145   0.325195 
## ar.1.neuk               -1.178880   0.203516 
## ar.1.pank               -0.950771   0.162474 
## ar.1.rein               -0.739190   0.113666 
## ar.1.span               -1.030189   0.159685 
## ar.1.zehl               -0.700978   0.109133 
## ar.1.scho               -1.140107   0.188365 
## ar.1.trko               -0.954548   0.156077 
## ne.1.chwi               -1.556838   0.229854 
## ne.1.frkr               -2.105432   0.303360 
## ne.1.lich               -1.333358   0.275089 
## ne.1.mahe               -2.082596   1.001100 
## ne.1.mitt               -1.382118   0.285452 
## ne.1.neuk               -1.005697   0.308230 
## ne.1.pank               -0.594043   0.265078 
## ne.1.rein               -1.531222   0.365439 
## ne.1.span               -1.496264   0.237766 
## ne.1.zehl               -1.197674   0.519991 
## ne.1.scho               -1.174415   0.268474 
## ne.1.trko               -2.805666   1.088384 
## end.sin(2 * pi * t/52)   0.013954   0.037300 
## end.cos(2 * pi * t/52)   0.918633   0.033880 
## end.1.chwi               0.817175   0.143907 
## end.1.frkr               0.142450   0.200990 
## end.1.lich               0.636253   0.176280 
## end.1.mahe               0.715465   0.135111 
## end.1.mitt               0.541823   0.187192 
## end.1.neuk               0.444922   0.175794 
## end.1.pank               0.867238   0.188664 
## end.1.rein               0.349285   0.222273 
## end.1.span               0.143326   0.183118 
## end.1.zehl               1.068714   0.140279 
## end.1.scho               0.730291   0.182337 
## end.1.trko               0.866186   0.143240 
## overdisp                 0.182920   0.009464 
## 
## Log-likelihood:   -8596.56 
## AIC:              17271.12 
## BIC:              17513.37 
## 
## Number of units:        12 
## Number of time points:  307
AIC(fit3)
## [1] 17271.12

Plotting the model fits, we see that for certain districts the neighbourhood component (orange) is very important (chwi), for others negligible (trko). This may be due to poor identifiability (after all, the autoregressive terms from the same and other districts are strongly correlated).

plot(fit3, unit = 1:12, par.settings = list(mfcol = c(6, 2)))

Adding a power law for spatial dependence.

The above formulation requires a lot of parameters, as the autoregressions on the same and other regions are handled separately, and it only takes first-order neighbours into account. A more parsimonious model which also allows for dependencies between indirect neighbours is a power-law formulation, where we set \[\begin{align} \mu_{it} & = \nu_{it} + \phi_i \sum_{j = 1}^K w_{ji} Y_{j, t - 1} \end{align}\] with weights \(w_{ji}\) defined as \[ w_{ji} = \begin{cases} (o_{ji} + 1)^{-\rho} & \text{ if } \text{normalize == FALSE}\\ \frac{(o_{ji} + 1)^{-\rho}}{\sum_{k = 1}^K (o_{jk} + 1)^{-\rho}} & \text{ if } \text{normalize == TRUE} \end{cases}, \] where \(o_{ji}\) is the path distance between districts \(j\) and \(i\).
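The power-law weights can likewise be computed by hand. Here is a small base-R sketch for a chain of three districts with a fixed, hypothetical decay parameter rho (in the actual model, rho is estimated from the data):

```r
# path distances for a chain 1 - 2 - 3 (0 = same district):
o <- rbind(c(0, 1, 2),
           c(1, 0, 1),
           c(2, 1, 0))
rho <- 2                          # hypothetical decay parameter
w_raw <- (o + 1)^(-rho)           # (o_ji + 1)^(-rho); same district gets 1
w <- w_raw / rowSums(w_raw)       # normalize = TRUE: each row sums to one
round(w, 3)
# weights decay with distance: for district 1, w_11 > w_12 > w_13
```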

Returning to our simple illustration with three districts, this step adds some additional connections between indirect neighbours (districts 1 and 3). They are shown in light gray as by construction of the model they are weaker than those between neighbouring districts.

##################################################################
# Model 4

# For the next model version we will formally incorporate the autoregressive
# component into the neighbourhood component
# (i.e., no longer treat autoregression on the same district separately).
# This can be done as follows.
# First we need to adapt the neighbourhood matrix, shifting by one
noroBE_power <- noroBE
noroBE_power@neighbourhood <- noroBE@neighbourhood + 1

# new control argument:
ctrl4 <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              # note: line ar = ... is removed
              ne = list(f = ~0 + fe(1, unitSpecific = TRUE),
                        weights = W_powerlaw(maxlag=5, normalize = TRUE,
                                             log = TRUE)), # this is new
              family = "NegBin1",
              subset = subset_fit)
# normalize = TRUE normalizes weights by the number of neighbours of the
# exporting district
# log = TRUE means optimization will be done on a log scale, ensuring 
# positivity of the decay parameter (which is desirable)
fit4 <- hhh4(noroBE_power, ctrl4)
AIC(fit4)
## [1] 17233.06

This yields a quite drastic improvement in AIC. We can visualize the weights \(w_{ji}\) as a function of the path distance \(o_{ji}\):

# visualize the neighbourhood weights:
plot(fit4, type = "neweights", main = "Weights by path distance\n (note: 1 is same region and 2 is direct neighbours here)")

When plotting the model fit we note that there is now no autoregressive component (blue).

plot(fit4, unit = 1:12, par.settings = list(mfcol = c(6, 2)))

# note there is now no autoregressive component in the fit plots

Adding seasonality to the epidemic component.

Including seasonality in the ar and ne components when both are present can lead to identifiability issues. In the more parsimonious model where only the ne component is present, however, including seasonal terms here can considerably improve the model fit. This is indeed the case in our example.

##################################################################
# Model 5:
ctrl5 <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              # now adding seasonality to the ne component:
              ne = list(f =  addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1),
                        weights = W_powerlaw(maxlag=5, normalize = TRUE,
                                             log=TRUE)), # this is new
              family = "NegBin1",
              subset = subset_fit)
fit5 <- hhh4(noroBE_power, ctrl5)
AIC(fit5)
## [1] 17205.41

We also consider the autocorrelation of the Pearson residuals. For several regions there are pronounced residual autocorrelations at lags 2 or 3, indicating that there is additional information to be exploited.

# compute Pearson residuals:
pr5 <- pearson_residuals(fit5)
par(mfrow = c(3, 4))
for(unit in colnames(pr5)){
  acf(pr5[, unit], lag.max = 5, ylim = c(-0.3, 0.3), main = unit)
}

Adding higher-order lags.

The hhh4addon package contains functionality to add higher-order lags to the endemic-epidemic model,

\[ \mu_{it} = \nu_{it} + \phi_i \sum_{j = 1}^K \sum_{d = 1}^D u_d w_{ji} Y_{j, t - d}. \]

Lag weights are (usually) governed by a single parameter, which is fed into a function returning the weights. The package contains several parameterizations, including:

  • geometric: \(\tilde{u}_d = \alpha(1 - \alpha)^{d - 1}\)
  • Poisson: \(\tilde{u}_d = \alpha^{d - 1}\exp(-\alpha)/(d - 1)!\)

Before entering the model, the lag weights are standardized so that they sum to one, setting \(u_d = \tilde{u}_d / \sum_{i = 1}^D \tilde{u}_{i}\).

Returning a last time to our toy illustration of three districts, this step adds a whole lot of (typically weak) connections between non-adjacent time points.

# check out examples of the different lag types:
# geometric:
(g <- geometric_lag(par_lag = 1.5, min_lag = 1, max_lag = 5))
## [1] 0.8177396888 0.1491765911 0.0272136178 0.0049644585 0.0009056439
# first weight corresponds to exp(1.5)/(1 + exp(1.5))
# 5 is also the default number of lags

# Poisson:
(p <- poisson_lag(par_lag = 0.8, min_lag = 1, max_lag = 5))
## [1] 0.1168028 0.2599493 0.2892639 0.2145896 0.1193945
# weights correspond to dpois(0:4, exp(0.8))/sum(dpois(0:4, exp(0.8)))

par(mfrow = 1:2)
plot(g, xlab = "lag", ylab = "weight", ylim = 0:1, type = "h", 
     main = "Geometric weights")
plot(p, xlab = "lag", ylab = "weight", ylim = 0:1, type = "h", 
     main = "Poisson weights")

# A two-point distribution and a triangular distribution are also available.
# Users can also provide their own weighting functions.
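The geometric weights printed above can be reproduced by hand, which also confirms the standardization described earlier. The sketch below assumes that the unconstrained parameter par_lag is mapped to \(\alpha\) via the inverse logit, as indicated by the comment on the first weight:

```r
# geometric lag weights for par_lag = 1.5, computed from scratch:
alpha <- plogis(1.5)                 # = exp(1.5) / (1 + exp(1.5))
d <- 1:5                             # D = 5 lags
u_tilde <- alpha * (1 - alpha)^(d - 1)
u <- u_tilde / sum(u_tilde)          # standardize to sum to one
round(u, 4)
```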

The parameterizations of these functions are chosen such that any value from the real line can be provided to them, which will facilitate optimization in the next step.

The lag weights can be pre-specified, but most of the time we will estimate them from the data rather than pre-specifying them. This can be done via the (poorly named) function profile_par_lag. We use geometric lags (which is also the default).

##################################################################
# Model 7
ctrl7 <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              # note: line ar = ... is removed
              ne = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                              S = 1),
                        weights = W_powerlaw(maxlag = 5, normalize = TRUE,
                                             log = TRUE)),
              family = "NegBin1",
              subset = subset_fit,
              # note: no value for the lag decay parameter par_lag is given
              # in the control; it will be estimated below
              funct_lag = geometric_lag) # this is new
# now use profile_par_lag (applies a profile likelihood procedure to estimate
# the lag decay parameter)
fit7 <- profile_par_lag(noroBE_power, ctrl7)
AIC(fit7)
## [1] 17138.12

We can plot the weights as estimated from the data:

# plot the weights assigned to the different lags:
par(mfrow = 1:2)
plot(fit7$distr_lag, type = "h", xlab  = "lag",
     ylab = "weight", ylim = 0:1)

And consider the Pearson residuals, which look less problematic than before (at least that’s what I like to tell myself).

# check Pearson residuals
pr7 <- pearson_residuals(fit7)
par(mfrow = c(3, 4))
for(unit in colnames(pr7)){
  acf(pr7[, unit], lag.max = 5, ylim = c(-0.3, 0.3), main = unit)
}

One-step-ahead forecasting

The packages contain functionality to imitate one-step-ahead (out-of-sample) forecasts retrospectively. To this end, the model is sequentially re-fitted, always including all data up to a given week; a one-week-ahead plug-in prediction is then obtained for the following week. We apply this to obtain predictions for weeks 313 through 364, which we had excluded from model fitting so far. Subsequently, you can use scores to evaluate the one-step-ahead predictions using several different statistical metrics. We compute the logarithmic score (logs) and the ranked probability score (rps); for both, lower values are better.
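For a single forecast, the logarithmic score is simply the negative log predictive probability of the observed count. A minimal base-R sketch with made-up numbers (mu_pred, psi and y_obs are illustrative, not taken from any fit):

```r
mu_pred <- 10.1                     # one-step-ahead predictive mean
psi <- 0.2                          # overdispersion; size = 1/psi
y_obs <- 10                         # observed count
logs <- -dnbinom(y_obs, mu = mu_pred, size = 1 / psi, log = TRUE)
logs
# a forecast placing more probability on the realized value yields a lower logs
```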

##################################################################
# one-step-ahead forecasting: generate forecasts sequentially
# compare models 2, 4 and 7
log2 <- capture.output(owa2 <- oneStepAhead(fit2, tp = c(312, 363)))
log4 <- capture.output(owa4 <- oneStepAhead(fit4, tp = c(312, 363)))
rm(log2, log4)
# the weird capture.output formulation is needed to suppress 
# numerous cat() messages.
# you could also just use
# owa2 <- oneStepAhead(fit2, tp = c(312, 363))
owa7 <- oneStepAhead_hhh4lag(fit7, tp = c(312, 363))
## Lag weights are not re-estimated for the one-step-ahead forecasts (set refit_par_lag = TRUE to re-fit them).
# the return objects contain predictions, observations and a few other things:
head(owa7$pred)
##          chwi     frkr      lich     mahe      mitt      neuk     pank     rein
## 313 10.143891 5.997015  8.315079 7.550797 12.276676  9.072282 18.06990 12.35556
## 314 10.052363 6.133326  9.568715 7.025105 12.332588 11.455435 20.52966 10.66938
## 315 10.837813 6.638302 10.085745 8.250180 10.817444 10.246092 22.11388 17.05596
## 316  9.737658 6.614877 16.126879 8.251785  8.842089  8.346345 24.43109 14.89341
## 317  8.017484 5.761749 10.387913 6.531485  6.529512  7.030808 20.62327 11.56332
## 318  9.328702 7.435577 13.008535 6.949797  8.054874  9.280143 20.39342 11.20380
##          span     zehl     scho     trko
## 313 13.161304 20.62583 19.19221 7.951694
## 314 10.103431 23.18901 18.21469 8.461257
## 315  8.692397 24.61323 21.20857 8.030405
## 316  7.391036 19.24499 17.81345 7.817094
## 317 10.879235 19.91117 14.02415 6.814946
## 318 11.205468 19.68513 16.13912 8.639782
head(owa7$observed)
##     chwi frkr lich mahe mitt neuk pank rein span zehl scho trko
## 313   10    4   10    5   17   16   23    5    7   24   17    5
## 314   12    8    9   10    6    8   23   26    3   24   28    3
## 315   11    7   31    7    3    5   28   14    4   14   18    2
## 316    5    8    6    5    0    5   22    8   18   20   12    4
## 317   13   16   20    6    7   11   20    8   14   17   18    6
## 318    7    8    7    5    3   16    9   20   12   22   31   11
# plot one-step-ahead point forecasts:
plot(noroBE@observed[313:364, 1], pch = 20, xlab = "week", ylab = "value")
lines(owa2$pred[, 1], type = "l", col = 2)
lines(owa4$pred[, 1], type = "l", col = 4)
lines(owa7$pred[, 1], type = "l", col = 6)

# compute and summarize scores:
colMeans(scores(owa2, which = c("logs", "rps")))
##     logs      rps 
## 2.506608 2.091024
colMeans(scores(owa4, which = c("logs", "rps")))
##     logs      rps 
## 2.492431 2.066390
colMeans(scores(owa7, which = c("logs", "rps")))
##     logs      rps 
## 2.475311 2.042008

We can see that the more complex model formulations also translate to improved predictive performance (all scores being negatively oriented; see ?scores).

Predictive moments at longer forecast horizons.

The hhh4addon package also contains functions to compute predictive moments (means, variances, autocorrelations) for longer time horizons as well as marginal/stationary moments of the fitted model.

##################################################################
# longer-term predictive moments can be computed using predictive_moments:
# predictive moments 10 weeks ahead:
pred_mom7 <- predictive_moments(fit7, t_condition = max(subset_fit), lgt = 10)
# print some predictive means:
head(pred_mom7$mu_matrix[, 1:6])
##            chwi     frkr      lich     mahe      mitt     neuk
## t=313 10.143891 5.997015  8.315079 7.550797 12.276676 9.072282
## t=314 10.256036 6.200316  9.202886 7.794306 11.403386 9.483784
## t=315 10.224376 6.285932  9.759873 7.944234 10.889366 9.757613
## t=316 10.077085 6.281429 10.044002 7.998614 10.518741 9.877621
## t=317  9.833192 6.199311 10.132328 7.930412 10.185663 9.836843
## t=318  9.529173 6.051469 10.026521 7.766129  9.843626 9.705029
# plot:
plot(fit7)
fanplot_prediction(pred_mom7, add = TRUE,
                   probs = c(0.05, 0.15, 0.25, 0.75, 0.85, 0.95),
                   pt.col = "black")

# note: the plot is based on a negative binomial approximation of the 
# predictive distributions.

# stationary/marginal moments are implemented, too (but don't always exist):
stat7 <- stationary_moments(fit7)
fanplot_stationary(stat7, unit = 4)

Over to you…

You are now invited to perform an analysis of rotavirus in Berlin along the lines of the above. The data are structured the same way as the norovirus data. Your “goal” is to formulate a model which will generate good one-step-ahead forecasts (we’ll use logS as the main outcome, i.e. the logarithmic score, which is equivalent to the negative predictive log-likelihood). You can load the data via

data("rotaBE")
# if you don't have the hhh4addon package installed you can use:
# load(
# url("https://github.com/cmmid/hhh4-workshop/raw/main/example_noro_rota/data/rotaBE.Rda")
# )

Your code should look somewhat like the following:

ctrl <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                               S = 1)),
              ar = list(f = ~ 0 + fe(1, unitSpecific = TRUE)),
              family = "NegBin1",
              subset = 6:312)
your_fit <- hhh4(rotaBE, ctrl)
# owa <- oneStepAhead(your_fit, tp = c(312, 363))
# or: owa <- oneStepAhead_hhh4lag(your_fit, tp = c(312, 363))
# if you have used profile_par_lag 
# colMeans(scores(owa, which = c("logs", "rps")))
#     logs       rps
# 1.931189  1.639351

If you like you can post your results here (https://github.com/cmmid/hhh4-workshop/issues/1) or on the chat.

Here are some model aspects you can play around with:

# get temperature data:
data_temperature <-
  read.csv(paste0("https://raw.githubusercontent.com/cmmid/hhh4-workshop/", 
                  "main/example_noro_rota/temperature_berlin.csv"))
temperature <- data_temperature$temperature7d
# your formula could look as follows:
ctrl <- list(end = list(f = addSeason2formula(~0 + fe(1, unitSpecific = TRUE),
                                              S = 1)),
             ar = list(f = ~ 0 + temperature + fe(1, unitSpecific = TRUE)),
             family = "NegBin1",
             subset = 6:312)
# though that is not necessarily a very smart way of using the covariate